Packing/Unpacking Information Generation for Efficient Generalized kr→r and r→kr Array Redistribution
Authors
Abstract
Array redistribution is often required to improve algorithm performance in parallel programs on distributed-memory multicomputers. Because it is performed at run time, there is a performance tradeoff between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present efficient methods to generate the packing/unpacking information for BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) and BLOCK-CYCLIC(r) to BLOCK-CYCLIC(kr) redistribution with arbitrary source/destination processor sets. The most significant improvement in this paper is that a processor does not need to construct the send/receive data sets for a redistribution. Based on the packing/unpacking information derived for the kr→r and r→kr redistributions, a processor can pack array elements into messages, or unpack them from messages, directly. To evaluate performance, we implemented our methods along with the PITFALLS method and Prylli's method on an IBM SP2 parallel machine. The experimental results show that our algorithms outperform the PITFALLS method and Prylli's method on all test samples.
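To make the kr→r setting concrete, the following is a minimal Python sketch of the standard block-cyclic ownership rule, together with a brute-force enumeration of the send sets. It is not the method of the paper: it is the kind of explicit send-set construction that the abstract says the proposed methods avoid. The function names (`owner`, `local_index`, `naive_send_sets`) and the zero-based indexing are assumptions made for illustration only.

```python
def owner(i, block, nprocs):
    """Processor owning global index i under BLOCK-CYCLIC(block) over nprocs processors."""
    return (i // block) % nprocs


def local_index(i, block, nprocs):
    """Position of global index i inside its owner's local array."""
    return (i // (block * nprocs)) * block + i % block


def naive_send_sets(n, k, r, src_procs, dst_procs):
    """Brute-force kr -> r redistribution plan (illustrative baseline only).

    Returns a dict mapping (source, destination) to the list of
    source-local indices that the source would pack for that destination.
    """
    sends = {}
    for i in range(n):
        s = owner(i, k * r, src_procs)   # owner under BLOCK-CYCLIC(kr)
        d = owner(i, r, dst_procs)       # owner under BLOCK-CYCLIC(r)
        sends.setdefault((s, d), []).append(local_index(i, k * r, src_procs))
    return sends


if __name__ == "__main__":
    # 12 elements, k = 2, r = 1, two source and two destination processors.
    for pair, idxs in sorted(naive_send_sets(12, 2, 1, 2, 2).items()):
        print(pair, idxs)
```

This enumeration touches every array element once per redistribution, which is the per-element cost that precomputed packing/unpacking information is meant to reduce.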
Similar resources
Efficient Methods for kr→r and r→kr Array
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, there is a performance tradeoff between the efficiency of new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present efficient algorithms for...
A Generalized Basic Cycle Calculation Method for Efficient Array Redistribution
In many scientific applications, dynamic array redistribution is usually required to enhance the performance of an algorithm. In this paper, we present a generalized basic-cycle calculation (GBCC) method to efficiently perform a BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors array redistribution. In the GBCC method, a processor first computes the source/destination proc...
Message Encoding Techniques for Efficient Array Redistribution
In this paper, we present message encoding techniques to improve the performance of BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) (and vice versa) array redistribution algorithms. The message encoding techniques are machine independent and could be used with different algorithms. By incorporating the techniques in array redistribution algorithms, one can reduce the computation overheads and improve the...
A Generalized Processor Mapping Technique for Array Redistribution
In many scientific applications, array redistribution is usually required to enhance data locality and reduce remote memory access in many parallel programs on distributed memory multicomputers. Since the redistribution is performed at runtime, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistribu...
Approximation algorithms and hardness results for the clique packing problem
For a fixed family F of graphs, an F-packing in a graph G is a set of pairwise vertex-disjoint subgraphs of G, each isomorphic to an element of F. Finding an F-packing that maximizes the number of covered edges is a natural generalization of the maximum matching problem, which is just F = {K2}. In this paper we provide new approximation algorithms and hardness results for the Kr-packing probl...
Publication date: 1999